- Spelling Checker
- ----------------
-
- BACKGROUND:
- ----------
- The spelling checker package is made up of three files. Each
- file name is listed below, along with a brief explanation of what
- the file is.
-
- SPELL.COM = The executable code for the spelling checker.
- DICTION.ARY = A file of many correctly spelled words.
- (Currently contains 21,082 words.)
- SPELL.DOC = Brief documentation file that gives
- instructions and hints on the use of SPELL.
-
- In addition to the files above, each time SPELL is run it will
- create a file named SPELL.ERR. This file lists the possible
- misspellings found in the input file.
-
- This is my first attempt at writing a program for release
- to the public. The program was designed to be as efficient as
- possible and fairly easy to use.
-
-
- RUNNING THE PROGRAM:
- -------------------
- At the DOS prompt enter:
-
- SPELL filename
-
- where filename is the name of the file you want checked. If
- you do not enter the filename on the command line, you will be
- prompted for it. Once you have entered a valid filename, you
- will see a message like the one below.
-
- Enter the drive (path) where SPELL.ERR is to be
- placed or just press RETURN for A:\
-
- In this example the default drive and directory is A:\. The
- prompt is asking where the file of possible misspellings should
- be created. If the default is acceptable, just press the RETURN
- key. Otherwise, enter a path name for the directory where the
- file is to be created; for example, C:\DOCUMENT\. The path name
- must end with the backslash character (\). If the DICTION.ARY
- file is not on the default drive and in the default directory,
- the program will prompt you to tell it where the file is located.
-
- The program is now running. While it runs, you will be
- informed of its progress by messages similar to the ones shown
- below.
-
- Scanning the input file... completed scanning X1 of X2 blocks.
-
- Verifying the spelling!
- Reading the dictionary... completed comparing X1 of X2 blocks.
-
- In each case X2 represents the total number of blocks in the file
- (a block is 128 bytes), and X1 is the number of blocks completed
- at the time of display. X1 is updated in steps of 16 because the
- program maintains an internal 2K buffer (16 blocks of 128 bytes)
- and updates the total each time it finishes a buffer. When the
- program completes you will see a message like the one below.
-
- A list of possible misspellings is in a file named SPELL.ERR
- you may list the file or print it at your convenience.
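-
- The X1 and X2 figures in the progress messages follow from the
- block and buffer sizes given above. The Python sketch below is
- only an illustration of that arithmetic; the function names are
- hypothetical (they are not taken from SPELL), and it assumes
- that a partial final block counts as a whole block.

```python
BLOCK_SIZE = 128      # bytes per block, as stated in the text
BUFFER_SIZE = 2048    # the 2K internal buffer


def total_blocks(file_size):
    """X2: number of 128-byte blocks; a partial last block counts (assumption)."""
    return (file_size + BLOCK_SIZE - 1) // BLOCK_SIZE


def progress_steps(file_size):
    """Successive X1 values, one per completed buffer."""
    x2 = total_blocks(file_size)
    step = BUFFER_SIZE // BLOCK_SIZE      # 16 blocks fit in one buffer
    steps = list(range(step, x2, step))   # full buffers before the last one
    if x2 > 0:
        steps.append(x2)                  # last buffer, possibly partial
    return steps
```

- Under these assumptions a 5000 byte file has 40 blocks, and the
- counter would read 16, 32 and then 40.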
-
-
- SPELL.ERR File
- --------------
- A sample of the SPELL.ERR file is shown below. The number
- to the left of a word is the number of occurrences of that word
- in the input file. In this case the word "documentation" was
- misspelled, and the same mistake was made twice in the input
- file. The last line of the SPELL.ERR file gives the total
- number of words contained in the input file (135 in this case).
-
-
- --- Possible Misspellings ---
- *** thiswordistoolongtobeavalidword
- 2 documentaion
- **** Input file contains a total of 135 word(s).
-
- Normally there will be many possible misspellings. The first
- entry in this file is an example of what you get when a word
- exceeds the maximum length of a valid word (30 characters).
- After viewing this file you may reenter your favorite word
- processor and correct any mistakes.
-
- Note:
- 1. The spelling checker flags any word that is not in its
- dictionary. The word may be truly misspelled, or it may simply
- be missing from the dictionary. Conversely, if you type the
- word "from" in the text when you intended to type the word
- "form", the spelling checker will not flag this as an error.
- The word "from" is itself valid, and no spelling checker has
- the ability to read your mind.
-
- 2. The spelling checker keeps a list of unique words.
- Every word is mapped to lower case for storage and
- comparison purposes, but its original case (upper and lower)
- is also saved. What this means to you is that the form in
- which a misspelled word first appears in a file is the form
- that will be put into SPELL.ERR. For example, if the words
- Ther, THER, ther, thEr and THeR appeared in the same file in
- that order, the corresponding entry in SPELL.ERR would be:
-
- 5 Ther
-
- 3. If a word is longer than 30 characters, the word is
- immediately output to SPELL.ERR with asterisks in the
- frequency field.
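-
- Notes 2 and 3 can be illustrated with a short Python sketch.
- It is not SPELL's actual code: the names tally and report_lines
- are hypothetical, and the dictionary is assumed to be an
- in-memory set of lower case words, whereas SPELL reads
- DICTION.ARY from disk after scanning the input.

```python
MAX_WORD_LEN = 30   # longest valid word, per note 3


def tally(words, dictionary):
    """Return (overlong, counts) for a sequence of scanned words.

    overlong: words longer than MAX_WORD_LEN, in order of appearance.
    counts:   {lower case key: [first-seen form, occurrences]} for
              words that are not in the dictionary.
    """
    overlong = []
    counts = {}
    for word in words:
        if len(word) > MAX_WORD_LEN:
            overlong.append(word)        # reported at once, never counted
            continue
        key = word.lower()               # compare without regard to case
        if key in dictionary:
            continue
        if key not in counts:
            counts[key] = [word, 0]      # remember the first form seen
        counts[key][1] += 1
    return overlong, counts


def report_lines(overlong, counts):
    """Format entries roughly in the style of the SPELL.ERR sample."""
    lines = ["*** " + word for word in overlong]
    lines += [f"{n} {form}" for form, n in counts.values()]
    return lines
```

- With the words Ther, THER, ther, thEr and THeR as input and an
- empty dictionary, report_lines gives the single entry "5 Ther".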
-
- INPUT:
- ------
- SPELL will accept either standard ASCII files or Wordstar
- files as input; the program treats both the same way. A side
- effect of this is that if the file being checked uses
- characters in the extended ASCII character set, they will be
- mapped into their Wordstar character equivalents. (If the last
- remark confuses you, just ignore it.)
-
- PERFORMANCE CONSIDERATIONS:
- --------------------------
- 1. Run time is acceptable with a floppy based system,
- although a hard disk or RAM disk is nice. Unless you have a
- lot of memory, however, you will not be able to set aside
- much for a RAM disk when checking large files. (I have
- checked an 80 page file, but it used slightly over 192K of
- memory.) If you can put the SPELL.ERR file on a different
- drive from the one that holds the DICTION.ARY file, the
- program will run somewhat faster.
-
- 2. DO NOT USE SPELL to verify a list of words that is in
- sorted order. Because of the internal data structures used,
- sorted input gives the worst case run time. The more random
- the ordering of the words, the faster the program will run.
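-
- This document does not say which data structure SPELL uses.
- One structure that behaves as described in item 2 is a plain
- (unbalanced) binary search tree, and the Python sketch below
- ASSUMES such a tree purely for illustration. The function name
- bst_height and the generated word list are hypothetical.

```python
import random


def bst_height(words):
    """Insert words into an unbalanced binary search tree; return its height."""
    root = None                  # each node is [word, left, right]
    height = 0
    for word in words:
        if root is None:
            root = [word, None, None]
            height = 1
            continue
        node, depth = root, 1
        while True:
            if word == node[0]:
                break                        # duplicate, nothing to insert
            side = 1 if word < node[0] else 2
            if node[side] is None:
                node[side] = [word, None, None]
                height = max(height, depth + 1)
                break
            node, depth = node[side], depth + 1
    return height


words = ["w%04d" % i for i in range(500)]    # already in sorted order
shuffled = words[:]
random.Random(1).shuffle(shuffled)

print(bst_height(words))      # 500: the tree degenerates into one long chain
print(bst_height(shuffled))   # much smaller for a random ordering
```

- In such a tree, sorted input makes every insertion walk past all
- earlier words, while a random ordering keeps the paths short.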
-
- NOTE:
- -----
- Due to the scanning algorithm, something like 7B7800 will
- appear in SPELL.ERR as B7800. The algorithm skips digits that
- appear before a word has started, but once it is building a
- word, digits are considered to be part of the word. (This was
- my preference. Logical?)
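-
- The scanning rule above can be sketched in a few lines of
- Python. This is a guess at the behavior rather than SPELL's own
- code: the name scan_words is hypothetical, and it assumes that
- any character other than a letter or digit ends the current
- word.

```python
def scan_words(text):
    """Split text into words.

    A word starts at a letter and then runs on through any
    following letters or digits. Digits met before a word has
    started are skipped.
    """
    words = []
    current = ""
    for ch in text:
        if ch.isalpha() or (ch.isdigit() and current):
            current += ch              # start or extend a word
        else:
            if current:
                words.append(current)  # a separator ends the word
            current = ""
    if current:
        words.append(current)          # word running to end of text
    return words
```

- For the text "Part 7B7800 costs 42 dollars" this yields Part,
- B7800, costs and dollars; the leading 7 and the number 42 are
- skipped.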
-
- DICTION.ARY:
- -----------
- This file is not in a standard ASCII format. The following
- is a brief explanation of how the file is set up, in case you
- are interested. The format of the file is nSTRnSTRnSTR.....
- where n is a single byte telling how many leading characters
- to keep from the previous word, and STR is the string to be
- appended to those n characters. For example, the three words
- ape, apple and appreciate would be stored as
-
- 0ape2ple3reciate
-
- This storage scheme yields more than a 60% savings in disk space
- compared to the same file in standard ASCII format. Without
- some type of compression the current dictionary file would
- approach 200K bytes. The scheme is easily improved upon, and it
- will be modified in future versions as the number of words in
- the dictionary increases.
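-
- The nSTR scheme described above can be sketched in Python as
- follows. This is an illustration, not the code used by SPELL.
- The text does not say how the end of STR is recognized, so the
- sketch ASSUMES that n is stored as a raw byte value from 0 to
- 30 and that words contain only bytes greater than 30, so any
- byte of 30 or less starts a new entry. The names encode and
- decode are hypothetical.

```python
MAX_WORD_LEN = 30   # count bytes therefore lie in the range 0..30


def encode(sorted_words):
    """Pack a sorted word list as n, STR, n, STR, ..."""
    out = bytearray()
    prev = b""
    for word in sorted_words:
        w = word.encode("ascii")
        n = 0
        # n = number of leading bytes shared with the previous word
        while n < min(len(prev), len(w)) and prev[n] == w[n]:
            n += 1
        out.append(n)        # single count byte
        out += w[n:]         # only the part that differs
        prev = w
    return bytes(out)


def decode(data):
    """Rebuild the word list from the packed bytes."""
    words = []
    prev = b""
    i = 0
    while i < len(data):
        n = data[i]          # characters to keep from the previous word
        i += 1
        start = i
        # STR runs until the next count byte (assumed to be <= 30)
        while i < len(data) and data[i] > MAX_WORD_LEN:
            i += 1
        prev = prev[:n] + data[start:i]
        words.append(prev.decode("ascii"))
    return words
```

- Encoding ape, apple and appreciate this way gives 16 bytes: the
- count bytes 0, 2 and 3 followed by "ape", "ple" and "reciate"
- respectively.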
-
- FUTURE VERSIONS:
- ---------------
- 1. Provision for a secondary user defined dictionary.
- 2. Greater number of words in the dictionary.
- 3. Improved (more compact) storage of the dictionary.
-
- Any suggestions or comments are welcome. They may be
- directed to:
-
- Keith Miller
- 626 Timothy Dr.
- Linthicum, Md. 21090
-
- If you use this program regularly, a contribution ($10 suggested)
- may be sent to the above address.
-
- DISCLAIMER NOTICE:
- -----------------
- An honest effort has been made to ensure that this product
- runs correctly and is accurate. However, it is distributed on an
- as-is basis with no guarantee, implied or otherwise. It may be
- distributed freely, but only in its unmodified form.